Sampling-Based Query Re-Optimization
Despite decades of work, query optimizers still make mistakes on
"difficult" queries because of bad cardinality estimates, often due to the
interaction of multiple predicates and correlations in the data. In this paper,
we propose a low-cost post-processing step that can take a plan produced by the
optimizer, detect when it is likely to have made such a mistake, and take steps
to fix it. Specifically, our solution is a sampling-based iterative procedure
that requires almost no changes to the original query optimizer or query
evaluation mechanism of the system. We show that this indeed imposes low
overhead and catches cases where three widely used optimizers (PostgreSQL and
two commercial systems) make large errors.
Comment: This is the extended version of a paper with the same title and
authors that appears in the Proceedings of the ACM SIGMOD International
Conference on Management of Data (SIGMOD 2016).
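The abstract attributes bad cardinality estimates to correlated predicates and proposes sampling as a way to catch them. The sketch below is not the paper's procedure; it only illustrates, under simple assumptions, how an estimate built on predicate independence can diverge from an estimate taken on a random sample. The function names `independence_estimate` and `sample_estimate`, and the representation of a table as a list of tuples, are hypothetical choices made for illustration.

```python
import random


def independence_estimate(rows, predicates):
    """Estimate the cardinality of a conjunction by multiplying the
    selectivities of the individual predicates (independence assumption)."""
    n = len(rows)
    selectivity = 1.0
    for predicate in predicates:
        selectivity *= sum(1 for row in rows if predicate(row)) / n
    return selectivity * n


def sample_estimate(rows, predicates, sample_size, seed=0):
    """Estimate the cardinality of the same conjunction by evaluating all
    predicates together on a uniform random sample and scaling up."""
    rng = random.Random(seed)
    sample = rng.sample(rows, sample_size)
    hits = sum(1 for row in sample if all(p(row) for p in predicates))
    return hits * len(rows) / sample_size
```

On a table whose two columns are perfectly correlated, for example `rows = [(i % 10, i % 10) for i in range(10000)]` with the predicates `row[0] == 3` and `row[1] == 3`, the independence estimate is about 100 rows while the true answer is 1000. The sample-based estimate evaluates both predicates on the same rows, so it sees the correlation directly.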
Ultra-Low-Power Superconductor Logic
We have developed a new superconducting digital technology, Reciprocal
Quantum Logic, that uses AC power carried on a transmission line, which also
serves as a clock. Using simple experiments we have demonstrated zero static
power dissipation, thermally limited dynamic power dissipation, high clock
stability, high operating margins and low BER. These features indicate that the
technology is scalable to far more complex circuits at a significant level of
integration. On the system level, Reciprocal Quantum Logic combines the high
speed and low-power signal levels of Single-Flux-Quantum signals with the
design methodology of CMOS, including low static power dissipation, low latency
combinational logic, and efficient device count.
Comment: 7 pages, 5 figures
Measuring co-authorship and networking-adjusted scientific impact
Appraisal of the scientific impact of researchers, teams and institutions
with productivity and citation metrics has major repercussions. Funding and
promotion of individuals and survival of teams and institutions depend on
publications and citations. In this competitive environment, the number of
authors per paper is increasing and apparently some co-authors don't satisfy
authorship criteria. Listing of individual contributions is still sporadic and
also open to manipulation. Metrics are needed to measure the networking
intensity for a single scientist or group of scientists accounting for patterns
of co-authorship. Here, I define I1 for a single scientist as the number of
authors who appear in at least I1 papers of the specific scientist. For a group
of scientists or institution, In is defined as the number of authors who appear
in at least In papers that bear the affiliation of the group or institution. I1
depends on the number of papers authored Np. The power exponent R of the
relationship between I1 and Np categorizes scientists as solitary (R>2.5),
nuclear (R=2.25-2.5), networked (R=2-2.25), extensively networked (R=1.75-2) or
collaborators (R<1.75). R may be used to adjust the citation impact of a
scientist for co-authorship networking. In similarly provides a simple measure of
the effective networking size to adjust the citation impact of groups or
institutions. Empirical data are provided for single scientists and
institutions for the proposed metrics. Cautious adoption of adjustments for
co-authorship and networking in scientific appraisals may offer incentives for
more accountable co-authorship behaviour in published articles.
Comment: 25 pages, 5 figures
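The definition of I1 has the same self-referential form as the h-index, so a small computation may make it concrete. The following is a minimal sketch, not the author's code. It assumes that each paper is given as a list of author names and that the scientist is excluded from the count of co-authors; the abstract does not state the latter. The function name `i1_index` is hypothetical.

```python
from collections import Counter


def i1_index(papers, scientist):
    """Return the largest k such that k co-authors each appear in at
    least k of the scientist's papers.

    papers: list of author-name lists, one list per paper.
    Assumption: the scientist is not counted among the co-authors.
    """
    # Count, for every co-author, the number of papers they share.
    shared = Counter(
        author
        for authors in papers
        for author in set(authors)
        if author != scientist
    )
    # Walk the counts in descending order, as for the h-index.
    i1 = 0
    for rank, count in enumerate(sorted(shared.values(), reverse=True), start=1):
        if count >= rank:
            i1 = rank
        else:
            break
    return i1
```

For example, with the four papers `[["A", "B", "C"], ["A", "B"], ["A", "B", "C"], ["A", "D"]]` and scientist `"A"`, co-author B shares three papers, C shares two and D shares one, so `i1_index` returns 2.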
A grid-based infrastructure for distributed retrieval
In large-scale distributed retrieval, challenges of latency, heterogeneity, and dynamicity emphasise the importance of infrastructural support in reducing the development costs of state-of-the-art solutions. We present a service-based infrastructure for distributed retrieval which blends middleware facilities and a design framework to ‘lift’ the resource sharing approach and the computational services of a European Grid platform into the domain of e-Science applications. In this paper, we give an overview of the DILIGENT Search Framework and illustrate its exploitation in the field of Earth Science.
International ranking systems for universities and institutions: a critical appraisal
Background: Ranking of universities and institutions has attracted wide attention recently. Several systems have been proposed that attempt to rank academic institutions worldwide. Methods: We review the two most publicly visible ranking systems, the Shanghai Jiao Tong University 'Academic Ranking of World Universities' and the Times Higher Education Supplement 'World University Rankings' and also briefly review other ranking systems that use different criteria. We assess the construct validity for educational and research excellence and the measurement validity of each of the proposed ranking criteria, and try to identify generic challenges in international ranking of universities and institutions. Results: None of the reviewed criteria for international ranking seems to have very good construct validity for both educational and research excellence, and most don't have very good construct validity even for just one of these two aspects of excellence. Measurement error for many items is also considerable or is not possible to determine due to lack of publication of the relevant data and methodology details. The concordance between the 2006 rankings by Shanghai and Times is modest at best, with only 133 universities shared in their top 200 lists. The examination of the existing international ranking systems suggests that generic challenges include adjustment for institutional size, definition of institutions, implications of average measurements of excellence versus measurements of extremes, adjustments for scientific field, time frame of measurement and allocation of credit for excellence. Conclusion: Naïve lists of international institutional rankings that do not address these fundamental challenges with transparent methods are misleading and should be abandoned. We make some suggestions on how focused and standardized evaluations of excellence could be improved and placed in proper context.
The influence of feature selection methods on accuracy, stability and interpretability of molecular signatures
Motivation: Biomarker discovery from high-dimensional data is a crucial
problem with enormous applications in biology and medicine. It is also
extremely challenging from a statistical viewpoint, but surprisingly few
studies have investigated the relative strengths and weaknesses of the plethora
of existing feature selection methods. Methods: We compare 32 feature selection
methods on 4 public gene expression datasets for breast cancer prognosis, in
terms of predictive performance, stability and functional interpretability of
the signatures they produce. Results: We observe that the feature selection
method has a significant influence on the accuracy, stability and
interpretability of signatures. Simple filter methods generally outperform more
complex embedded or wrapper methods, and ensemble feature selection has
generally no positive effect. Overall a simple Student's t-test seems to
provide the best results. Availability: Code and data are publicly available at
http://cbio.ensmp.fr/~ahaury/
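The abstract reports that a simple Student's t-test filter gave the best results overall. As a rough illustration of what such a filter does, and not a reproduction of the authors' published code, the sketch below scores each feature by the absolute two-sample t statistic with pooled variance and keeps the top k. The function names `t_statistic` and `rank_features`, the list-of-lists data layout and the 0/1 label encoding are assumptions made for this example.

```python
import math
from statistics import mean, variance


def t_statistic(a, b):
    """Student's two-sample t statistic with pooled variance.

    Assumes each group has at least two values and that the pooled
    variance is non-zero.
    """
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))


def rank_features(X, y, k):
    """Return the indices of the k features with the largest |t|.

    X: list of samples, each a list of feature values.
    y: list of 0/1 class labels, one per sample.
    """
    scores = []
    for j in range(len(X[0])):
        positives = [row[j] for row, label in zip(X, y) if label == 1]
        negatives = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(abs(t_statistic(positives, negatives)))
    # Sort feature indices by score, highest first.
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]
```

A filter of this kind scores every feature independently of the others and of any downstream classifier, which is what distinguishes it from the embedded and wrapper methods mentioned in the abstract.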
Extended H2 emission line sources from UWISH2
We present the extended source catalogue for the UKIRT Wide Field Infrared Survey for H2 (UWISH2). The survey is unbiased along the inner Galactic Plane from l ≈ 357° to l ≈ 65° and |b| ≤ 1.5° and covers 209 deg². A further 42.0 and 35.5 deg² of high dust column density regions have been targeted in Cygnus and Auriga. We have identified 33 200 individual extended H2 features. They have been classified to be associated with about 700 groups of jets and outflows, 284 individual (candidate) planetary nebulae, 30 supernova remnants and about 1300 photodissociation regions. We find a clear decline of star formation activity (traced by H2 emission from jets and photodissociation regions) with increasing distance from the Galactic Centre. About 60 per cent of the detected candidate planetary nebulae have no known counterpart and 25 per cent of all supernova remnants have detectable H2 emission associated with them.